420 research outputs found

    Genome-Wide Analysis of Nucleotide-Level Variation in Commonly Used Saccharomyces cerevisiae Strains

    Ten years have passed since the genome of Saccharomyces cerevisiae (more precisely, the S288c strain) was completely sequenced. However, experimental work in yeast is commonly performed using strains that are of unknown genetic relationship to S288c. Here, we characterized the nucleotide-level similarity between S288c and seven commonly used lab strains (A364A, W303, FL100, CEN.PK, Σ1278b, SK1 and BY4716) using 25mer oligonucleotide microarrays that provide complete and redundant coverage of the ∼12 Mb Saccharomyces cerevisiae genome. Using these data, we assessed the frequency and distribution of nucleotide variation in comparison to the sequenced reference genome. These data allow us to infer the relationships between experimentally important strains of yeast and provide insight for experimental designs that are sensitive to sequence variation. We propose a rational approach for near complete sequencing of strains related to the reference using these data and directed re-sequencing. These data and new visualization tools are accessible online in a new resource, the Yeast SNPs Browser (YSB; http://gbrowse.princeton.edu/cgi-bin/gbrowse/yeast_strains_snps), which is available to all researchers.

    Lethality and centrality in protein networks

    In this paper we present the first mathematical analysis of the protein interaction network found in the yeast S. cerevisiae. We show that (a) the identified protein network displays a characteristic scale-free topology that demonstrates striking similarity to the inherent organization of metabolic networks in particular, and to that of robust and error-tolerant networks in general; and (b) the likelihood that deletion of an individual gene product will prove lethal for the yeast cell clearly correlates with the number of interactions the protein has, meaning that highly connected proteins are more likely to prove essential than proteins with a low number of links to other proteins. These results suggest that a scale-free architecture is a generic property of cellular networks, attributable to universal self-organizing principles of robust and error-tolerant networks, and that it is likely to represent a generic topology for protein-protein interactions. Comment: See also http:/www.nd.edu/~networks and http:/www.nd.edu/~networks/cel
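
The degree-versus-lethality comparison described in this abstract can be illustrated with a small sketch. This is not the authors' analysis: the edge list, the set of essential proteins, the degree threshold and all function names below are hypothetical, chosen only to show how one might compare the essential fraction among highly connected proteins with that among the rest.

```python
from collections import defaultdict


def degrees(edges):
    """Count the number of interaction partners (degree) of each protein."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def essential_fraction_by_degree(edges, essential, threshold):
    """Return (fraction essential among proteins with degree > threshold,
    fraction essential among the remaining proteins)."""
    deg = degrees(edges)
    hubs = [p for p, k in deg.items() if k > threshold]
    others = [p for p, k in deg.items() if k <= threshold]

    def fraction(group):
        return sum(p in essential for p in group) / len(group) if group else 0.0

    return fraction(hubs), fraction(others)


# Hypothetical toy data, not taken from the paper.
TOY_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C"), ("F", "G")]
TOY_ESSENTIAL = {"A", "B"}

hub_frac, other_frac = essential_fraction_by_degree(TOY_EDGES, TOY_ESSENTIAL, threshold=2)
print(f"essential among hubs: {hub_frac:.2f}, among others: {other_frac:.2f}")
```

On real data the threshold would need to be chosen with care, and a rank correlation between degree and lethality may be more informative than a single cutoff.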

    MCL-CAw: A refinement of MCL for detecting yeast complexes from weighted PPI networks by incorporating core-attachment structure

    Background: The reconstruction of protein complexes from the physical interactome of organisms serves as a building block towards understanding the higher level organization of the cell. Over the past few years, several independent high-throughput experiments have helped to catalogue an enormous amount of physical protein interaction data from organisms such as yeast. However, these individual datasets show a lack of correlation with each other and also contain a substantial number of false positives (noise). Over these years, several affinity scoring schemes have also been devised to improve the quality of these datasets. Therefore, the challenge now is to detect meaningful as well as novel complexes from protein interaction (PPI) networks derived by combining datasets from multiple sources and by making use of these affinity scoring schemes. In the attempt to tackle this challenge, the Markov Clustering algorithm (MCL) has proved to be a popular and reasonably successful method, mainly due to its scalability, robustness, and ability to work on scored (weighted) networks. However, MCL produces many noisy clusters, which either do not match known complexes or have additional proteins that reduce the accuracies of correctly predicted complexes. Results: Inspired by recent experimental observations by Gavin and colleagues on the modularity structure in yeast complexes and the distinctive properties of "core" and "attachment" proteins, we develop a core-attachment based refinement method coupled to MCL for reconstruction of yeast complexes from scored (weighted) PPI networks. We combine physical interactions from two recent "pull-down" experiments to generate an unscored PPI network. We then score this network using available affinity scoring schemes to generate multiple scored PPI networks. The evaluation of our method (called MCL-CAw) on these networks shows that: (i) MCL-CAw derives a larger number of yeast complexes, and with better accuracies, than MCL, particularly in the presence of natural noise; (ii) affinity scoring can effectively reduce the impact of noise on MCL-CAw and thereby improve the quality (precision and recall) of its predicted complexes; (iii) MCL-CAw responds well to most available scoring schemes. We discuss several instances where MCL-CAw was successful in deriving meaningful complexes, and where it missed a few proteins or whole complexes due to affinity scoring of the networks. We compare MCL-CAw with several recent complex detection algorithms on unscored and scored networks, and assess the relative performance of the algorithms on these networks. Further, we study the impact of augmenting physical datasets with computationally inferred interactions for complex detection. Finally, we analyse the essentiality of proteins within predicted complexes to understand a possible correlation between protein essentiality and their ability to form complexes. Conclusions: We demonstrate that core-attachment based refinement in MCL-CAw improves the predictions of MCL on yeast PPI networks. We show that affinity scoring improves the performance of MCL-CAw. http://deepblue.lib.umich.edu/bitstream/2027.42/78256/1/1471-2105-11-504.xml http://deepblue.lib.umich.edu/bitstream/2027.42/78256/2/1471-2105-11-504-S1.PDF http://deepblue.lib.umich.edu/bitstream/2027.42/78256/3/1471-2105-11-504-S2.ZIP http://deepblue.lib.umich.edu/bitstream/2027.42/78256/4/1471-2105-11-504.pdf Peer Reviewed
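
For readers unfamiliar with the baseline that MCL-CAw refines, the following is a minimal sketch of plain Markov Clustering on a small weighted network. It is an illustration only: it does not implement the core-attachment refinement described in the abstract, and the toy weight matrix, the inflation value and the function names are assumptions made for the example.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m


def expand(m):
    """Expansion step: square the matrix (two-step random-walk flow)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(weights, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster a symmetric weighted adjacency matrix with plain MCL."""
    n = len(weights)
    m = [[float(weights[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # add self-loops, as is customary for MCL
    m = normalize_columns(m)
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row with non-negligible entries belongs to an attractor node;
    # the columns it reaches form one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy network: two triangles joined by one weak (0.1) edge.
TOY_WEIGHTS = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = mcl(TOY_WEIGHTS)
print(toy_clusters)
```

The weak edge between the two triangles stands in for a low-affinity interaction; inflation suppresses the flow across it, so the two triangles end up in separate clusters.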

    Bulk Segregant Analysis Using Single Nucleotide Polymorphism Microarrays

    Bulk segregant analysis (BSA) using microarrays and extreme array mapping (XAM) have recently been used to rapidly identify genomic regions associated with phenotypes in multiple species. These experiments, however, require the identification of single feature polymorphisms (SFP) between the cross parents for each new combination of genotypes, which raises the cost of experiments. The availability of genomic polymorphism data in Arabidopsis thaliana, coupled with the efficient designs of Single Nucleotide Polymorphism (SNP) genotyping arrays, removes the requirement for SFP detection and lowers the per array cost, thereby lowering the overall cost per experiment. To demonstrate that these approaches would be functional on SNP arrays and determine confidence intervals, we analyzed hybridizations of natural accessions to the Arabidopsis ATSNPTILE array and simulated BSA or XAM given a variety of gene models, populations, and bulk selection parameters. Our results show a striking degree of correlation between the genotyping output of both methods, which suggests that the benefit of SFP genotyping in the context of BSA can be had with the cheaper, more efficient SNP arrays. As a final proof of concept, we hybridized the DNA from bulks of an F2 mapping population of a sulfur and selenium ionomics mutant to both the Arabidopsis ATTILE1R and ATSNPTILE arrays, which produced almost identical results. We have produced R scripts that prompt the user for the required parameters and perform the BSA analysis using the ATSNPTILE1 array and have provided them as supplemental data files.

    Short Co-occurring Polypeptide Regions Can Predict Global Protein Interaction Maps

    A goal of the post-genomics era has been to elucidate a detailed global map of protein-protein interactions (PPIs) within a cell. Here, we show that the presence of co-occurring short polypeptide sequences between interacting protein partners appears to be conserved across different organisms. We present an algorithm to automatically generate PPI prediction method parameters for various organisms and illustrate that global PPIs can be predicted from previously reported PPIs within the same or a different organism using protein primary sequences. The PPI prediction code is further accelerated through the use of parallel multi-core programming, which improves its usability for large scale or proteome-wide PPI prediction. We predict and analyze hundreds of novel human PPIs, experimentally confirm protein functions and, importantly, predict the first genome-wide PPI maps for S. pombe (∼9,000 PPIs) and C. elegans (∼37,500 PPIs).

    Reduced body weight is a common effect of gene knockout in mice

    Background: During a search for obesity candidate genes in a small region of the mouse genome, we noticed that many genes when knocked out influence body weight. To determine whether this was a general feature of gene knockout or a chance occurrence, we surveyed the Jackson Laboratory Mouse Genome Database for knockout mouse strains and their phenotypes. Body weights were not available for all strains so we also obtained body weight information by contacting a random sample of investigators responsible for a knockout strain. Results: We classified each knockout mouse strain as (1) lighter and smaller, (2) larger and heavier, or (3) the same weight, relative to control mice. We excluded knockout strains that died early in life, even though this type of lethality is often associated with a small embryo or reduced body size. Based on a dataset of 1,977 knockout strains, we found that 31% of viable knockout mouse strains weighed less and an additional 3% weighed more than did controls. Conclusion: Body weight is potentially a latent variable in about a third of experiments that use knockout mice and should be considered in interpreting experimental outcomes, e.g., in studies of hypertension, drug and hormone metabolism, organ development, cell proliferation and apoptosis, digestion, heart rate, or atherosclerosis. If we assume that the knockout genes we surveyed are representative then upward of 6,000 genes are predicted to influence the size of a mouse. Body weight is highly heritable, and numerous quantitative trait loci have been mapped in mice, but "multigenic" is an insufficient term for the thousands of loci that could contribute to this complex trait.

    Cmr1/WDR76 defines a nuclear genotoxic stress body linking genome integrity and protein quality control

    DNA replication stress is a source of genomic instability. Here we identify changed mutation rate 1 (Cmr1) as a factor involved in the response to DNA replication stress in Saccharomyces cerevisiae and show that Cmr1, together with Mrc1/Claspin, Pph3, the chaperonin containing TCP1 (CCT) and 25 other proteins, defines a novel intranuclear quality control compartment (INQ) that sequesters misfolded, ubiquitylated and sumoylated proteins in response to genotoxic stress. The diversity of proteins that localize to INQ indicates that other biological processes such as cell cycle progression, chromatin and mitotic spindle organization may also be regulated through INQ. Similar to Cmr1, its human orthologue WDR76 responds to proteasome inhibition and DNA damage by relocalizing to nuclear foci and physically associating with CCT, suggesting an evolutionarily conserved biological function. We propose that Cmr1/WDR76 plays a role in the recovery from genotoxic stress through regulation of the turnover of sumoylated and phosphorylated proteins.

    Discovery and Expansion of Gene Modules by Seeking Isolated Groups in a Random Graph Process

    BACKGROUND: A central problem in systems biology research is the identification and extension of biological modules: groups of genes or proteins participating in a common cellular process or physical complex. As a result, there is a persistent need for practical, principled methods to infer the modular organization of genes from genome-scale data. RESULTS: We introduce a novel approach for the identification of modules based on the persistence of isolated gene groups within an evolving graph process. First, the underlying genomic data are summarized in the form of ranked gene-gene relationships, thereby accommodating studies that quantify the relevant biological relationship directly or indirectly. Then, the observed gene-gene relationship ranks are viewed as the outcome of a random graph process and candidate modules are given by the identifiable subgraphs that arise during this process. An isolation index is computed for each module, which quantifies the statistical significance of its survival time. CONCLUSIONS: The Miso (module isolation) method predicts gene modules from genomic data and the associated isolation index provides a module-specific measure of confidence. Improving on existing alternatives, such as graph clustering and the global pruning of dendrograms, this index offers two intuitively appealing features: (1) the score is module-specific; and (2) different choices of threshold correlate logically with the resulting performance, i.e. a stringent cutoff yields high quality predictions, but low sensitivity. Through the analysis of yeast phenotype data, the Miso method is shown to outperform existing alternatives in terms of the specificity and sensitivity of its predictions.

    Single Feature Polymorphism Discovery in Rice

    The discovery of nucleotide diversity captured as single feature polymorphism (SFP) by using an expression array is a high-throughput and effective method for detecting genome-wide polymorphism. The efficacy of this method was tested in rice, and the results presented in the paper indicate high sensitivity in predicting SFP. The sensitivity of polymorphism detection was further demonstrated by the fact that no bias was observed in detecting SFP with either single or multiple nucleotide polymorphisms. The high density SFP data that can be generated quite effectively by the current method hold promise for high resolution genetic mapping studies, as the physical locations of features are well-defined on the rice genome.

    The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing

    We describe a statistical framework for QTL mapping using bulk segregant analysis (BSA) based on high throughput, short-read sequencing. Our proposed approach is based on a smoothed version of the standard statistic, and takes into account variation in allele frequency estimates due to sampling of segregants to form bulks as well as variation introduced during the sequencing of bulks. Using simulation, we explore the impact of key experimental variables such as bulk size and sequencing coverage on the ability to detect QTLs. Counterintuitively, we find that relatively large bulks maximize the power to detect QTLs even though this implies weaker selection and less extreme allele frequency differences. Our simulation studies suggest that with large bulks and sufficient sequencing depth, the methods we propose can be used to detect even weak effect QTLs. We demonstrate the utility of this framework by application to a BSA experiment in the budding yeast Saccharomyces cerevisiae.
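
The two sources of noise named in this abstract, sampling of segregants into a bulk and sampling of reads during sequencing, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' framework: it assumes haploid segregants that each carry one allele, models both stages as independent Bernoulli draws, and does not implement the smoothed statistic. All function names and parameter values are hypothetical.

```python
import random


def simulate_allele_frequency(true_freq, bulk_size, coverage, rng):
    """Return one estimated allele frequency after two sampling stages."""
    # Stage 1: draw segregants into the bulk (assumed haploid, one allele each).
    carriers = sum(rng.random() < true_freq for _ in range(bulk_size))
    bulk_freq = carriers / bulk_size
    # Stage 2: draw sequencing reads from the pooled DNA of the bulk.
    reads = sum(rng.random() < bulk_freq for _ in range(coverage))
    return reads / coverage


def estimate_variance(true_freq, bulk_size, coverage, replicates=2000, seed=0):
    """Empirical variance of the frequency estimate over repeated experiments."""
    rng = random.Random(seed)
    estimates = [simulate_allele_frequency(true_freq, bulk_size, coverage, rng)
                 for _ in range(replicates)]
    mean = sum(estimates) / len(estimates)
    return sum((x - mean) ** 2 for x in estimates) / (len(estimates) - 1)


# Hypothetical settings: same coverage, small versus large bulk.
small_bulk_var = estimate_variance(true_freq=0.5, bulk_size=10, coverage=50)
large_bulk_var = estimate_variance(true_freq=0.5, bulk_size=1000, coverage=50)
print(f"variance with bulk of 10:   {small_bulk_var:.4f}")
print(f"variance with bulk of 1000: {large_bulk_var:.4f}")
```

With coverage held fixed, the larger bulk gives a less noisy frequency estimate, which is one ingredient of the abstract's observation that large bulks can improve power despite weaker selection.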